A syllable-scale framework for language identification
Authors: T. Martin, B. Baker, E. Wong, S. Sridharan
Abstract
Whilst several examples of segment-based approaches to language identification (LID) have been published, they have typically been conducted using only a small number of languages, or varying feature sets, making it difficult to determine how segment length influences the accuracy of LID systems. In this study, phone-triplets are used as crude approximations of a syllable-length sub-word segmental unit. The proposed pseudo-syllabic length framework is subsequently used for both qualitative and quantitative examination of the contributions made by acoustic, phonotactic and prosodic information sources, and trialled in accordance with the NIST 1996 LID protocol. Firstly, a series of experimental comparisons is conducted to examine the utility of using segmental units for modelling short-term acoustic features. These include comparisons between language-specific Gaussian mixture models (GMMs), language-specific GMMs for each segmental unit, and finally language-specific hidden Markov models (HMMs) for each segment, undertaken in an attempt to better model the temporal evolution of acoustic features. In a second tier of experiments, the contribution of both broad and fine class phonotactic information, when considered over an extended time frame, is contrasted with an implementation of the currently popular parallel phone recognition language modelling (PPRLM) technique. Results indicate that this information can be used to complement existing PPRLM systems to obtain improved performance. The pseudo-syllabic framework is also used to model prosodic dynamics and compared to an implemented version of a recently published system, achieving comparable levels of performance.
© 2005 Elsevier Ltd. All rights reserved. doi:10.1016/j.csl.2005.07.004
T. Martin, B. Baker, E. Wong and S. Sridharan, Computer Speech and Language 20 (2006) 276–302.
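The phone-triplet idea mentioned in the abstract can be illustrated with a small sketch. This is not the authors' system: it assumes overlapping triplets taken with a sliding window over an already-decoded phone sequence, and it scores languages with simple add-alpha smoothed triplet counts rather than GMMs, HMMs or PPRLM. All function names, the toy phone strings and the language labels are hypothetical.

```python
import math
from collections import Counter


def phone_triplets(phones):
    """Return overlapping phone triplets (sliding window of length 3).

    Assumption: triplets overlap; the paper may segment differently.
    """
    return [tuple(phones[i:i + 3]) for i in range(len(phones) - 2)]


def train_triplet_model(utterances):
    """Count triplets over a list of phone sequences for one language."""
    counts = Counter()
    for phones in utterances:
        counts.update(phone_triplets(phones))
    return counts


def log_likelihood(phones, counts, vocab_size, alpha=1.0):
    """Add-alpha smoothed log-likelihood of an utterance's triplets."""
    total = sum(counts.values())
    return sum(
        math.log((counts[t] + alpha) / (total + alpha * vocab_size))
        for t in phone_triplets(phones)
    )


def identify(phones, models):
    """Pick the language whose triplet model scores the utterance highest."""
    vocab = set()
    for counts in models.values():
        vocab.update(counts)
    vocab_size = max(len(vocab), 1)
    return max(
        models,
        key=lambda lang: log_likelihood(phones, models[lang], vocab_size),
    )


# Toy data: hypothetical phone strings for two made-up languages.
models = {
    "lang_a": train_triplet_model([["k", "a", "t", "a", "k", "a"]]),
    "lang_b": train_triplet_model([["s", "t", "r", "i", "s", "t", "r"]]),
}
best = identify(["k", "a", "t", "a"], models)
```

In a fuller system the triplet would serve as the unit over which acoustic, phonotactic or prosodic models are trained; the count-based scorer here only stands in for that modelling step.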
Related resources
Syllable structure in Old, Middle and Modern Persian: A contrastive analysis
Evolution of languages has always been of interest to linguists. In this paper we study the natural progress of the syllable structure from Old Persian (O.P) to Middle Persian (Mi.P) and up to the Modern Persian (Mo.P). For this purpose all the words containing consonant sequences are collected from specific sources of each of these languages, and then analysed according to the syllab...
Language identification of code switching Malay-English words using syllable structure information
This paper introduces a language identification approach using syllable structure information. We also review and compare other approaches. Most of these approaches use linguistic information for language identification. The information used includes Malay affixation information, an English vocabulary list, alphabet n-grams, and grapheme n-grams. The approach using syllable structur...
Pitch and energy trajectory modelling in a syllable length temporal framework for language identification
Recent studies have indicated that language identity is encapsulated in a more complicated manner than that represented by short-term acoustic features. In particular, trajectory information over syllable-like durations has shown significant promise. This study introduces a novel three-tiered language identification approach which incorporates this information as well as acoustic context in the ...
Language identification using acoustic log-likelihoods of syllable-like units
Automatic spoken language identification (LID) is the task of identifying the language from a short utterance of the speech signal uttered by an unknown speaker. The most successful approach to LID uses phone recognizers of several languages in parallel [Zissman, M.A., 1996. Comparison of four approaches to automatic language identification of telephone speech. IEEE Trans. Speech Audio Process....
Language Identification by Using Syllable-Based Duration Classification on Code-Switching Speech
Many approaches to automatic spoken language identification (LID) on monolingual speech are successful, but LID on code-switching speech, which requires identifying at least two languages in one acoustic utterance, challenges these approaches. In [6], we successfully used a one-pass approach to recognize Chinese characters in Mandarin-Taiwanese code-switching speech. In this paper, we introduce ...
Journal:
- Computer Speech & Language
Volume 20
Pages 276–302
Publication date: 2006